Columnar storage and list-based processing for graph database management systems
Authors
Abstract
We revisit column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs). Similar to column-oriented RDBMSs, GDBMSs support read-heavy analytical workloads, but these workloads have fundamentally different data access patterns than traditional analytical workloads. We first derive a set of desiderata for optimizing the storage and query processors of GDBMSs based on their access patterns. We then present the design of columnar storage, compression, and query processing techniques based on these desiderata. In addition to showing direct integration of existing techniques from columnar RDBMSs, we also propose novel ones that are optimized for GDBMSs. These include a novel list-based query processor, which avoids the expensive data copies of traditional block-based processors under many-to-many joins; a new data structure we call single-indexed edge property pages, with an accompanying edge ID scheme; and a new application of Jacobson's bit vector index for compressing NULL values and empty lists. We integrated our techniques into the GraphflowDB in-memory GDBMS. Through extensive experiments, we demonstrate the scalability and query performance benefits of our techniques.
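As a rough illustration of the last technique named in the abstract, the sketch below stores only the non-NULL values of a column, together with a bit vector and per-block prefix counts, so that a positional lookup can be answered with a rank query. It is a simplified stand-in for the general idea, not the paper's implementation: the class name `NullCompressedColumn`, the block size, and the use of Python lists are all assumptions made for this example.

```python
class NullCompressedColumn:
    """Keeps non-NULL values densely, plus a bit vector with block-level ranks.

    Hypothetical illustration only; names and layout are not from the paper.
    """

    BLOCK = 16  # bits covered by one prefix count (arbitrary choice)

    def __init__(self, values):
        # bits[i] is True when position i holds a non-NULL value.
        self.bits = [v is not None for v in values]
        # dense holds the non-NULL values in their original order.
        self.dense = [v for v in values if v is not None]
        # prefix[b] = number of set bits before the start of block b.
        self.prefix = []
        count = 0
        for i, bit in enumerate(self.bits):
            if i % self.BLOCK == 0:
                self.prefix.append(count)
            count += bit

    def rank(self, pos):
        """Number of non-NULL entries strictly before pos (pos < len(bits))."""
        block = pos // self.BLOCK
        start = block * self.BLOCK
        # One stored count plus a scan of at most BLOCK - 1 bits.
        return self.prefix[block] + sum(self.bits[start:pos])

    def get(self, pos):
        """Value at logical position pos, or None if it is NULL."""
        if not self.bits[pos]:
            return None
        return self.dense[self.rank(pos)]


values = [None, 7, None, None, 3] + [None] * 20 + [9]
col = NullCompressedColumn(values)
print(col.get(1), col.get(2), col.get(25))
```

In this sketch a lookup touches one prefix count and scans at most one block of bits, so access cost does not grow with the column length, while NULL positions cost roughly one bit each plus the amortized prefix counts.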
Similar resources
Building a Columnar Database on Shared Main Memory-Based Storage
In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this uni...
Hybrid Storage Management for Database Systems
The use of flash-based solid state drives (SSDs) in storage systems is growing. Adding SSDs to a storage system not only raises the question of how to manage the SSDs, but also raises the question of whether current buffer pool algorithms will still work effectively. We are interested in the use of hybrid storage systems, consisting of SSDs and hard disk drives (HDDs), for database management. ...
Ontology Based Query Processing in Database Management Systems
The use of semantic knowledge in its various forms has become an important aspect in managing data in database and information systems. In the form of integrity constraints, it has been used intensively in query optimization for some time. Similarly, data integration techniques have utilized semantic knowledge to handle heterogeneity for query processing on distributed information sources in a ...
BPP: Large Graph Storage for Efficient Disk Based Processing
Processing very large graphs like social networks, biological and chemical compounds is a challenging task. Distributed graph processing systems process the billion-scale graphs efficiently but incur overheads of efficient partitioning and distribution of the graph over a cluster of nodes. Distributed processing also requires cluster management and fault tolerance. In order to overcome these pr...
Graph Database Systems for Genomics
Genome databases have specific requirements which limit the usefulness of some database management systems. By using more appropriate database technology, a database system can be developed for genome data. We have developed a data representation based on graph theory which captures the highly interconnected structure of genome data. Graphs are a language which can be tailored for describing ge...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2021
ISSN: 2150-8097
DOI: https://doi.org/10.14778/3476249.3476297